Efficient Computation and Visualization of Multiple Density-Based Clustering Hierarchies
Abstract
HDBSCAN*, a state-of-the-art density-based hierarchical clustering method, produces a hierarchical organization of clusters in a dataset w.r.t. a parameter mpts. While a small change in mpts typically leads to a small change in the clustering structure, choosing a “good” mpts value can be challenging: depending on the data distribution, a high or low mpts value may be more appropriate, and certain clusters may reveal themselves at different values of mpts. To explore results for a range of mpts values, one has to run HDBSCAN* for each value in the range independently, which can be computationally impractical. In this paper, we propose an approach to efficiently compute all HDBSCAN* hierarchies for a range of mpts values by building upon results from computational geometry to replace HDBSCAN*'s complete graph with a smaller equivalent graph. An experimental evaluation shows that with our approach one can obtain over one hundred hierarchies for the computational cost equivalent to running HDBSCAN* about twice, which corresponds to a speedup of more than 60 times, compared to running HDBSCAN* independently that many times. We also propose a series of visualizations that allow users to analyze a collection of hierarchies for a range of mpts values, along with case studies that illustrate how these analyses are performed.
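The baseline that the abstract calls impractical can be sketched as follows: one minimum spanning tree of the complete mutual reachability graph per mpts value. This is a minimal illustration under assumptions, not the paper's method. The function names (`core_distances`, `mutual_reachability`, `minimum_spanning_tree`, `msts_for_range`) are hypothetical, the core distance is taken as the distance to the mpts-th nearest neighbor counting the point itself, and the graph reduction proposed in the paper is not implemented here.

```python
import numpy as np

def core_distances(X, kmax):
    """Return (dist, core): the pairwise Euclidean distance matrix and an
    (n, kmax) array whose column k-1 is each point's core distance for mpts = k."""
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # Each sorted row starts with the point itself (distance 0), so column k-1
    # is the distance to the k-th nearest neighbor counting the point itself.
    core = np.sort(dist, axis=1)[:, :kmax]
    return dist, core

def mutual_reachability(dist, core_k):
    """Mutual reachability distance: max(core(a), core(b), d(a, b))."""
    m = np.maximum(dist, np.maximum(core_k[:, None], core_k[None, :]))
    np.fill_diagonal(m, 0.0)
    return m

def minimum_spanning_tree(w):
    """Prim's algorithm on a dense weight matrix; returns (u, v, weight) edges."""
    n = w.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    best = np.full(n, np.inf)      # cheapest known edge into the tree
    parent = np.full(n, -1)
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = int(np.argmin(np.where(in_tree, np.inf, best)))
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((int(parent[u]), u, float(best[u])))
        closer = ~in_tree & (w[u] < best)
        best[closer] = w[u][closer]
        parent[closer] = u
    return edges

def msts_for_range(X, kmax):
    """Naive baseline: one MST of the complete graph for every mpts in 1..kmax."""
    dist, core = core_distances(X, kmax)   # computed once, reused for every mpts
    return {k: minimum_spanning_tree(mutual_reachability(dist, core[:, k - 1]))
            for k in range(1, kmax + 1)}
```

In this sketch the core distances for all mpts values come from a single sorted neighbor table, but every MST still scans the complete graph with its n(n-1)/2 edges. That repeated cost is what the abstract describes reducing by substituting a smaller equivalent graph.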
Similar resources
Efficient Anytime Density-based Clustering
Many clustering algorithms suffer from scalability problems on massive datasets and do not support any user interaction during runtime. To tackle these problems, anytime clustering algorithms are proposed. They produce a fast approximate result which is continuously refined during the further run. Also, they can be stopped or suspended anytime and provide an answer. In this paper, we propose a ...
Animated visualization of multiple intersecting hierarchies
We describe a new information structure composed of multiple intersecting hierarchies, which we call a Polyarchy. Visualizing polyarchies enables use of novel views for discovery of relationships which are very difficult using existing hierarchy visualization tools. This paper will describe the visualization design and system architecture challenges as well as our current solutions. Visual Pivo...
Density-Based Method for Clustering and Visualization of Complex Data
This paper discusses clustering and visualization of data structure. The authors review algorithmic solutions currently found in the literature for clustering large volumes of data, focusing on their disadvantages and problems. In addition, the authors introduce and analyze a density-based algorithm called OPTICS (Ordering Points To Identify the Clustering Structure) ...
Improvement of density-based clustering algorithm using modifying the density definitions and input parameter
Clustering, the grouping of similar samples, is one of the main tasks in data mining. There is a wide variety of clustering algorithms, and density-based clustering is one category among them. Various algorithms have been proposed in this category; one of the most widely used is DBSCAN. DBSCAN can identify clusters of different shapes in the dataset and automatically i...
DenPEHC: Density peak based efficient hierarchical clustering
Existing hierarchical clustering algorithms involve a flat clustering component and an additional agglomerative or divisive procedure. This paper presents a density peak based hierarchical clustering method (DenPEHC), which directly generates clusters on each possible clustering layer, and introduces a grid granulation framework to enable DenPEHC to cluster large-scale and high-dimensional (LSH...
Journal
Journal title: IEEE Transactions on Knowledge and Data Engineering
Year: 2021
ISSN: 1558-2191, 1041-4347, 2326-3865
DOI: https://doi.org/10.1109/tkde.2019.2962412